A Static Semantics

A.1 Mutually Recursive Type Declarations

Figure 1: Mutually Recursive Type Declaration Checking

The type declaration judgement in the paper only supports recursive types. Here, we include
support for mutually recursive types by splitting the three key premises of the paper rule into three
distinct judgements, each of which processes the entire list of type declarations before the
next one begins. The three additional judgements in Fig. 1 operate as follows:

1. The judgement $\vdash_{\Theta_0} \theta \sim_{names} \Theta_{names}$ creates a named type context $\Theta_{names}$ containing only
type names from $\theta$, checking only that they are unique. Each binding in $\Theta_{names}$ is of the form
$T[?, ?]$.

2. The judgement $\vdash_{\Theta_0\Theta_{names}} \theta \sim_{defs} \Theta_{defs}$ creates a named type context $\Theta_{defs}$ containing only
type names and their definitions, checking only that any named types mentioned in the type
definitions are available. Each binding in $\Theta_{defs}$ is of the form $T[\delta, ?]$.

3. The judgement $\vdash_{\Theta_0\Theta_{defs}} \theta \sim_{metadata} \Theta$ finally checks that the metadata is well-typed.
Metadata can explicitly refer to metadata of a type listed earlier in the list of type declarations, $\theta$,
but any other reference is a type error.
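
The three passes can be illustrated with a small Python sketch. It is not the formal system: the `Decl` record, its fields, and the function names are hypothetical, definitions are reduced to the set of type names they mention, and metadata checking is reduced to checking which earlier types it refers to. Treating prelude types as always available in the third pass is also an assumption.

```python
from dataclasses import dataclass

UNKNOWN = "?"  # plays the role of '?' in a binding such as T[?, ?]


@dataclass
class Decl:
    """Hypothetical stand-in for one type declaration in the list."""
    name: str                  # the declared type name T
    mentions: tuple = ()       # type names mentioned in the definition
    metadata_refs: tuple = ()  # type names whose metadata the metadata uses


def check_names(decls, prelude):
    """Pass 1: collect names only, checking that they are unique."""
    ctx = {}
    for d in decls:
        if d.name in prelude or d.name in ctx:
            raise TypeError(f"duplicate type name: {d.name}")
        ctx[d.name] = (UNKNOWN, UNKNOWN)
    return ctx


def check_defs(decls, prelude, names_ctx):
    """Pass 2: record definitions, checking that mentioned names exist."""
    ctx = {}
    for d in decls:
        for t in d.mentions:
            if t not in prelude and t not in names_ctx:
                raise TypeError(f"{d.name} mentions unknown type {t}")
        ctx[d.name] = (d.mentions, UNKNOWN)
    return ctx


def check_metadata(decls, prelude, defs_ctx):
    """Pass 3: metadata may only use metadata of earlier declarations."""
    ctx = {}
    for d in decls:
        for t in d.metadata_refs:
            # Assumption: prelude types count as already checked.
            if t not in prelude and t not in ctx:
                raise TypeError(f"metadata of {d.name} refers to {t} too early")
        definition, _ = defs_ctx[d.name]
        ctx[d.name] = (definition, "checked")
    return ctx


def check_declarations(decls, prelude=frozenset()):
    """Run each pass over the whole list before starting the next one."""
    names_ctx = check_names(decls, prelude)
    defs_ctx = check_defs(decls, prelude, names_ctx)
    return check_metadata(decls, prelude, defs_ctx)
```

Because the second pass sees every name collected by the first, `Decl("A", ("B",))` followed by `Decl("B", ("A",))` is accepted, whereas metadata in `A` that refers to `B` is rejected because `B` is declared later.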

A.2  Context  Formation 


[Inference rules not recoverable from the extracted text. Legible rule names: Th-extend, def-unknown, def-ot, def-ct, metadata-unknown, metadata, G-empty, G-extend.]

Figure 2: Context Formation

In our metatheory, we need judgements expressing well-formed contexts, shown in Fig. 2. A
lemma corresponding to Lemma 3 in the paper applies to our definition here as well:

Lemma 1 (Type Declaration (Mutually Recursive)). If $\vdash \Theta_0$ and $\vdash_{\Theta_0} \theta \sim \Theta$ then $\vdash \Theta_0\Theta$.

Proof.  We  prove  the  following  more  explicit  lemma  after  inverting  the  type  declaration  derivation. 

□ 

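Lemma 2 (Type Declaration (Explicit)). If $\vdash \Theta_0$ and $\vdash_{\Theta_0} \theta \sim_{names} \Theta_{names}$ then $\vdash \Theta_0\Theta_{names}$, and if
$\vdash_{\Theta_0\Theta_{names}} \theta \sim_{defs} \Theta_{defs}$ then $\vdash \Theta_0\Theta_{defs}$, and if $\vdash_{\Theta_0\Theta_{defs}} \theta \sim_{metadata} \Theta$ then $\vdash \Theta_0\Theta$.

Proof. The proof is by induction on the structure of $\theta$. We give the case $\theta = \mathtt{objtype}[T, \omega, e_m]; \theta'$
(the case $\theta = \emptyset$ is trivial and the case $\theta = \mathtt{casetype}[T, \chi, e_m]$ follows a directly corresponding
argument). We have: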

• By rule OT-names (which is the only rule that syntactically applies) we have that $\Theta_{names} =
T[?, ?]; \Theta'_{names}$ and $T \notin \mathrm{dom}(\Theta_0)$ and $T \notin \mathrm{dom}(\Theta'_{names})$ and by the IH, $\vdash \Theta_0\Theta'_{names}$.
Therefore, $\vdash \Theta_0\Theta'_{names}, T[?, ?]$ by rules Th-extend, def-unknown and metadata-unknown. Thus,
$\vdash \Theta_0\Theta_{names}$ by suitable application of an exchange lemma, which we have assumed by
metatheoretically defining $\Theta$ as a finite map over type names.

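• By rule OT-defs (which is the only rule that syntactically applies) we have that $\Theta_{defs} =
T[\mathtt{ot}[\omega], ?]; \Theta'_{defs}$ and $\omega$ is well-formed and by the IH, $\vdash \Theta_0\Theta'_{defs}$. Therefore, $\vdash \Theta_0\Theta'_{defs}, T[\mathtt{ot}[\omega], ?]$
by rules Th-extend, def-ot and metadata-unknown. By exchange, we have that $\vdash \Theta_0\Theta_{defs}$.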

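• By rule OT-metadata (which is the only rule that syntactically applies), we have that $\Theta =
T[\mathtt{ot}[\omega], i_m : \tau_m]; \Theta'$ and $\emptyset \vdash e_m \leadsto i_m \Rightarrow \tau_m$ and by the external type preservation
lemma, $\emptyset \vdash i_m \Rightarrow \tau_m$. By rules Th-extend, def-ot and metadata we have that
$\vdash \Theta_0, T[\mathtt{ot}[\omega], i_m : \tau_m]$. By the IH, we then have that $\vdash \Theta_0\Theta$.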

□ 


A.3  Metatheoretic  Functions 

We  use  several  metatheoretic  functions.  Their  properties  are  defined  below. 

Definition 1. If $\mathrm{parsestream}(body) = i_{ps}$ then $\emptyset \vdash_{\Theta_0} i_{ps} \Leftarrow \mathtt{named}[\mathtt{ParseStream}]$.

Definition 2. If $\emptyset \vdash_{\Theta_0} i_{ps} \Leftarrow \mathtt{named}[\mathtt{ParseStream}]$ then there exists a $body$ such that $\mathrm{body}(i_{ps}) =
body$ and $\mathrm{parsestream}(body) = i_{ps}$.

Definition 3. If $\mathrm{eparse}(body) = e$ then $e$ is the abstract syntax corresponding to the concrete
syntax in $body$, as described in the paper.

A.4  Notes  on  Reification  and  Dereification 

Lemma  1  in  the  paper  (Reification)  requires  additional  clauses  for  completeness: 

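Lemma 3 (Reification (Full)). If $\Theta_0 \subseteq \Theta$ then

1. If $\vdash_\Theta \tau$ then $\tau \downarrow i$ and $\emptyset \vdash_\Theta i \Leftarrow \mathtt{named}[\mathtt{Type}]$.

2. $x \downarrow i$ and $\emptyset \vdash_\Theta i \Leftarrow \mathtt{named}[\mathtt{ID}]$.¹

3. $\ell \downarrow i$ and $\emptyset \vdash_\Theta i \Leftarrow \mathtt{named}[\mathtt{ID}]$.

4. $C \downarrow i$ and $\emptyset \vdash_\Theta i \Leftarrow \mathtt{named}[\mathtt{ID}]$.

5. $T \downarrow i$ and $\emptyset \vdash_\Theta i \Leftarrow \mathtt{named}[\mathtt{ID}]$.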

¹ Note that this judgement is, perhaps confusingly, about the metavariable x, not the internal term form for variables.
Our syntax in the paper does not distinguish these directly, as is conventional, so rule R-var looks self-referential.

Proof. The proof for types is immediate by inspection. The remaining clauses are assumed
definitionally, as we do not wish to prescribe particular grammars for variables, labels, constructor
labels and type labels. □

For  completeness,  we  can  also  state  that  every  reified  elaboration  can  be  dereified: 

Lemma 4 (Completeness of Dereification). If $\Theta_0 \subseteq \Theta$ and $\emptyset \vdash_\Theta i \Leftarrow \mathtt{named}[\mathtt{Exp}]$ and $i$ val then
$i \uparrow e$.

Proof.  The  proof  is  a  simple  induction  that  simply  checks  for  coverage.  □ 

A.5  Notes  on  Internal  Type  Safety  and  Type  Preservation 

Theorem  1,  Theorem  2  and  Lemma  2  in  the  paper  require  a  slightly  stronger  inductive  hypothesis. 
We  can  prove  the  following  stronger  theorems  instead. 

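Theorem 1 (Internal Type Safety (Strong)). If $\vdash \Theta$ then

1. If $\emptyset \vdash_\Theta i \Leftarrow \tau$ then either $i$ val or $i \mapsto i'$ such that $\emptyset \vdash_\Theta i' \Leftarrow \tau$.

2. If $\emptyset \vdash_\Theta i \Rightarrow \tau$, then either $i$ val or $i \mapsto i'$ such that $\emptyset \vdash_\Theta i' \Rightarrow \tau$.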

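Theorem 2 (External Type Preservation (Strong)). If $\vdash \Theta$ and $\vdash_\Theta \Gamma$ then

1. If $\Gamma \vdash_\Theta e \leadsto i \Leftarrow \tau$ then $\Gamma \vdash_\Theta i \Leftarrow \tau$.

2. If $\Gamma \vdash_\Theta e \leadsto i \Rightarrow \tau$ then $\Gamma \vdash_\Theta i \Rightarrow \tau$.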

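Lemma 5 (Translational Type Preservation (Strong)). If $\vdash \Theta$ and $\vdash_\Theta \Gamma_{out}$ and $\vdash_\Theta \Gamma$ and $\mathrm{dom}(\Gamma_{out}) \cap
\mathrm{dom}(\Gamma) = \emptyset$ (which we can assume implicitly due to alpha renaming at binders) then

1. If $\Gamma_{out}; \Gamma \vdash_\Theta \hat{e} \leadsto i \Leftarrow \tau$ then $\Gamma_{out}\Gamma \vdash_\Theta i \Leftarrow \tau$.

2. If $\Gamma_{out}; \Gamma \vdash_\Theta \hat{e} \leadsto i \Rightarrow \tau$ then $\Gamma_{out}\Gamma \vdash_\Theta i \Rightarrow \tau$.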

B Corpus Analysis Methodology

Due to space limitations in our ECOOP'14 paper, some details of the methodology used
to perform the corpus analysis were omitted. Here we provide those details.

For our analysis we used a recent version (20130901r) of the Qualitas Corpus [2], consisting of
107 Java projects. The projects in the corpus contain various types of files that are necessary
for a correct project setup in a programming environment, but we were interested solely in the .java
files, so we ran the following commands in a terminal to remove all the non-Java files:

    find . -type f -not -name "*.java" -exec rm {} \;
    find . -type d -empty -delete

Having  obtained  a  set  of  exclusively  Java  files,  we  ran  the  following  commands  to  find  all 
constructors  in  the  code: 

    grep -r -n '[[:space:]]*public[[:space:]]*[A-Z][a-zA-Z0-9_-]*[[:space:]]*(' . > ~/tmp/public-constructors.txt 2>/dev/null;
    grep -r -n '[[:space:]]*protected[[:space:]]*[A-Z][a-zA-Z0-9_-]*[[:space:]]*(' . > ~/tmp/protected-constructors.txt 2>/dev/null;
    grep -r -n '[[:space:]]*private[[:space:]]*[A-Z][a-zA-Z0-9_-]*[[:space:]]*(' . > ~/tmp/private-constructors.txt 2>/dev/null;
    grep -r -n '^[[:space:]]*[A-Z][a-zA-Z0-9_-]*[[:space:]]*(' . > ~/tmp/package-private-constructors.txt 2>/dev/null;

These  commands  use  several  observations  pertaining  to  Java  constructors: 

1. Constructor names start with a capital letter.

2. Constructor names are usually followed by an opening parenthesis.

3. Constructor names may be preceded by scope-related keywords, such as public, protected,
and private, or may not have any keywords in front of them (which means they are package
private).

4. Constructor names or their scope-related keywords may have zero or more whitespace
characters before them and no other characters.

5. There may be zero or more whitespace characters between the constructor name and an
opening parenthesis and no other characters.

Accordingly, the first command finds all public constructors, the second all
protected constructors, the third all private constructors, and the fourth all package-private constructors.
After the four corresponding files were created, we scanned them visually to verify that
only constructors were found, merged the four files into one, and counted 124,873
Java constructors.
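
For readers who prefer to experiment outside the shell, the following Python sketch mirrors the four grep patterns with the standard `re` module. It is an illustration only and was not part of the analysis: the names `PATTERNS` and `classify` are hypothetical, the POSIX class `[[:space:]]` is approximated by an explicit whitespace set, and anchoring only the package-private pattern at the start of the line is an assumption about the original commands.

```python
import re

# Approximation of the POSIX [[:space:]] class used by grep.
WS = r"[ \t\n\r\f\v]*"
# A capitalised identifier, as in the grep commands.
NAME = r"[A-Z][a-zA-Z0-9_-]*"

PATTERNS = {
    "public": re.compile(rf"{WS}public{WS}{NAME}{WS}\("),
    "protected": re.compile(rf"{WS}protected{WS}{NAME}{WS}\("),
    "private": re.compile(rf"{WS}private{WS}{NAME}{WS}\("),
    # Assumption: only this pattern is anchored at the start of the line.
    "package-private": re.compile(rf"^{WS}{NAME}{WS}\("),
}


def classify(line):
    """Return the visibility whose pattern matches the line, or None."""
    for visibility, pattern in PATTERNS.items():
        if pattern.search(line):
            return visibility
    return None


if __name__ == "__main__":
    for sample in ["    public Foo(String s) {",
                   "Bar (int x)",
                   "    public void run() {"]:
        print(repr(sample), "->", classify(sample))
```

Like the grep commands, these patterns are purely textual, so a line that merely resembles a constructor header can match. This is why the matches were also checked visually.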

Having obtained a collection of constructors, we used the following vi command
to find constructors that take at least one string argument:

    :g/String /

Running this command, we found that there were 30,161 constructors, i.e., 24% of the total,
that had at least one string argument. Then, we searched for constructors that used strings that
could be substituted with TSLs. We did a visual scan of the constructors' signatures and inferred
the functionality of the constructor and the arguments it was taking in. To give an intuition for how
the inference process went, below we give a positive example, i.e., an example of a constructor
in which we concluded that a TSL could be used instead of the string argument, and a negative
example, i.e., an example of a constructor in which we concluded that, although the constructor
had a string argument, using a TSL instead might not have benefited the implementation.
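
The filtering step and the reported percentage can be reproduced with a few lines of Python. This is a sketch, not the procedure we ran: the function names are hypothetical, and the input is assumed to be a list of constructor signature lines such as those collected by the grep commands.

```python
def with_string_argument(signatures):
    """Keep signatures containing 'String ' (with the trailing space),
    the same literal text that the vi command searches for."""
    return [s for s in signatures if "String " in s]


def percentage(part, total):
    """Share of `part` in `total`, rounded to a whole percent."""
    return round(100 * part / total)


if __name__ == "__main__":
    sample = ["public A(String s)", "public B(int x)", "public C(String[] args)"]
    print(with_string_argument(sample))
    print(percentage(30161, 124873))
```

With the counts reported here, `percentage(30161, 124873)` evaluates to 24. As with the vi command, a purely textual search misses a parameter written as `String[] args`, because no space follows `String`.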

Positive  Example  Consider  the  following  example,  which  was  positively  classified  in  our  code 
analysis: 

Constructor: public IPAddressConversion(String IPaddr)

File: jtopen-7.1/com/ibm/as400/util/commtrace/IPAddressConversion.java
Line: 51

This constructor was found in the JTOpen project, on line 51 in a file called IPAddressConversion.java.
The constructor is called IPAddressConversion and the name of the string argument is IPaddr. From these
pieces of information, we inferred that the string argument represents an IPv4 address, which is
usually of the following form, where D is a number between 0 and 255:

D.D.D.D 

This  format  could  be  represented  by  a  Wyvern  TSL  where  we  support  variable  splicing  using 
the  format  %x: 

    objtype IPAddress

      metadata = new : HasTSL
        val parser = ~
          start <- D '.' D '.' D '.' D
            fn (e1, e2, e3, e4) => ~
              new
                val d1 = %e1%
                val d2 = %e2%
                val d3 = %e3%
                val d4 = %e4%
          D <- numlit
            fn (e1) => e1
          D <- ID
            fn (e1) => e1

Hence, the constructor could use a TSL instead of the string argument and thus benefit from a
guarantee that the passed-in argument adheres to the necessary format.
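
For contrast, the following Python sketch shows the kind of run-time check that a string-based constructor has to perform by hand to enforce the D.D.D.D format. It is illustrative only: `parse_ipv4` is a hypothetical helper, not code from JTOpen, and it accepts only decimal literals, not the spliced identifiers that the TSL grammar also allows.

```python
import re

# Four groups of one to three ASCII digits separated by dots.
_IPV4 = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")


def parse_ipv4(text):
    """Return the four components as integers, or raise ValueError."""
    match = _IPV4.fullmatch(text)
    if match is None:
        raise ValueError(f"not of the form D.D.D.D: {text!r}")
    parts = tuple(int(group) for group in match.groups())
    if any(part > 255 for part in parts):
        raise ValueError(f"component out of range 0..255: {text!r}")
    return parts


if __name__ == "__main__":
    print(parse_ipv4("192.168.0.1"))
```

A malformed argument such as `"256.1.1.1"` is only detected when this check executes, whereas with a TSL the same mistake in a literal would be reported when the literal is parsed.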

Negative  Example  A  negative  example  in  our  inference  process,  i.e.,  a  constructor  that  has  at 
least  one  string  argument  but  may  not  benefit  from  substituting  it  with  a  TSL,  is  presented  below: 

Constructor: public InternalEntity(String name, String text, boolean inExternalSubset)

File: xerces-2.10.0/src/org/apache/xerces/impl/XMLEntityManager.java
Line: 2,486

This constructor was found in the Xerces project, on line 2,486 of a file called XMLEntityManager.java.
The name of the constructor is InternalEntity, and it has two string arguments: one called name and
the other called text. The name of the constructor and the names of the passed-in string arguments
are generic and thus we cannot infer the exact functionality of the constructor. In turn, we cannot
suggest a TSL to be used to capture the functionality. Therefore, it is not obvious that the
constructor would benefit from using a TSL instead of any of its string arguments, and we classify this
example as negative.

To give an insight into all types of strings that we identified, we provide examples for each type:
Figure 3 presents examples for the “Identifier” category; Figure 4 presents examples for the
“Directory path” category; Figure 5 presents examples for the “Pattern” category; Figure 6 presents
examples for the “URL/URI” category; Figure 7 presents examples for the “Other” category
(containing several subtypes of strings).

1. Constructor: public CatalogEntry(String publicID, CatalogReader catalog)

File: netbeans-6.9.1/xml.catalog/src/org/netbeans/modules/xml/catalog/CatalogEntry.java
Line: 64

2. Constructor: public ConfigurableAwtMenu(String menuID, VariableBundle vars)

File: pooka-3.0-080505/net/suberic/util/gui/ConfigurableAwtMenu.java
Line: 35

3. Constructor: public ExternalRuleID(String id)

File: pmd-4.2.5/src/net/sourceforge/pmd/ExternalRuleID.java
Line: 11

Figure 3: Examples of constructors in the “Identifier” category (process ID, user ID, column or
row IDs, etc.)

1. Constructor: public VersionRelease(String homeDir)

File: jboss-5.1.0/build/VersionRelease.java
Line: 72

2. Constructor: public DataQueueDocument(AS400 system, String path)

File: jtopen-7.1/com/ibm/as400/vaccess/DataQueueDocument.java
Line: 140

3. Constructor: public ClassPathContextResource(String path, ClassLoader classLoader)

File: springframework-3.0.5/projects/org.springframework.core/src/main/java/org/springframework/core/io/DefaultResourceLoader.java
Line: 127

Figure 4: Examples of constructors in the “Directory path” category


1. Constructor: public RegexFilter(String regex)

File: drjava-stable-20100913-r5387/src/edu/rice/cs/drjava/config/RecursiveFileListProperty.java
Line: 61

2. Constructor: public NameEndsWith(String suffix)

File: struts-2.2.1/src/xwork-core/src/main/java/com/opensymphony/xwork2/util/ResolverUtil.java
Line: 141

3. Constructor: public NumberEditor(JSpinner jSpinner, String decimalFormat)

File: netbeans-6.9.1/html/src/org/netbeans/modules/html/palette/items/OLCustomizer.java
Line: 303

Figure 5: Examples of constructors in the “Pattern” category (regular expressions, prefixes and
suffixes, delimiters, format templates, etc.)

1. Constructor: public XConnection(ExpressionContext exprContext, String driver, String
dbURL, String user, String password)

File: xalan-2.7.1/src/org/apache/xalan/lib/sql/XConnection.java
Line: 239

2. Constructor: public DOMLocatorImpl(int lineNumber, int columnNumber, String uri)

File: xerces-2.10.0/src/org/apache/xerces/dom/DOMLocatorImpl.java
Line: 82

3. Constructor: public MockHttpServletRequest(ServletContext servletContext, String
method, String requestURI)

File: springframework-3.0.5/projects/org.springframework.web/src/test/java/org/springframework/mock/web/MockHttpServletRequest.java
Line: 223

Figure 6: Examples of constructors in the “URL/URI” category

1. ZIP code

Constructor: public Customer(Integer customerId, String zip)

File: netbeans-6.9.1/websvc.rest/test/unit/data/testsrc/com/acme/Customer.java
Line: 69

2. Password

Constructor: public WrappedConnectionRequestInfo(final String user, final String password)

File: jboss-5.1.0/connector/src/main/org/jboss/resource/adapter/jdbc/WrappedConnectionRequestInfo.java
Line: 39

3. Query

Constructor: public JDBCXYDataset(Connection con, String query)

File: jfreechart-1.0.13/source/org/jfree/data/jdbc/JDBCXYDataset.java
Line: 175

4. HTML/XML

Constructor: public HtmlContentPopUp(java.awt.Frame parent, String title, boolean
modal, String html)

File: jag-6.1/src/com/finalist/jaggenerator/HtmlContentPopUp.java
Line: 88

5. IP address

Constructor: public HostRecord(String ip, String name, boolean ssh)

File: netbeans-6.9.1/cnd.remote/src/org/netbeans/modules/cnd/remote/ui/wizard/HostsListTableModel.java
Line: 173

6. Version

Constructor: public EncryptHeader(short type, String version)

File: jgroups-2.10.0/src/org/jgroups/protocols/ENCRYPT.java
Line: 1,147

Figure 7: Examples of constructors in the “Other” category